Abstract

Lower dimensional signal representation schemes frequently assume that the signal of interest lies in a single vector space. In the context of the recently developed theory of compressive sensing (CS), it is often assumed that the signal of interest is sparse in an orthonormal basis. However, in many practical applications, this requirement may be too restrictive. A generalization of the standard sparsity assumption is that the signal lies in a union of subspaces. Recovery of such signals from a small number of samples has been studied recently in several works. Here, we consider the problem of subspace detection in which our goal is to identify the subspace (from the union) in which the signal lies from a small number of samples, in the presence of noise. More specifically, we derive performance bounds and conditions under which reliable subspace detection is guaranteed using maximum likelihood (ML) detection. We begin by treating general unions and then specialize the results to the case in which the subspaces have structure leading to block sparsity. In our analysis, we treat both general sampling operators and random sampling matrices. With general unions, we show that under certain conditions, the number of measurements required for reliable subspace detection in the presence of noise via ML is less than that implied using the restricted isometry property which guarantees signal recovery. In the special case of block sparse signals, we quantify the gain achievable over standard sparsity in subspace detection. Our results also strengthen existing results on sparsity pattern recovery in the presence of noise under the standard sparsity model.

Index terms- Maximum likelihood detection, union of linear subspaces, linear sampling, compressive 
sensing, random projections, block sparsity pattern recovery 

I. Introduction 

The compressive sensing (CS) framework has established that a small number of measurements acquired via 
random projections are sufficient for signal recovery when the signal of interest is sparse in a certain basis. Consider 
a length-$N$ signal $\mathbf{x}$ which can be represented in a basis $\mathbf{V}$ such that $\mathbf{x} = \mathbf{V}\mathbf{c}$. The signal $\mathbf{x}$ is said to be $k$-sparse
in the basis $\mathbf{V}$ if $\mathbf{c}$ has only $k$ nonzero coefficients where $k$ is much smaller than $N$. It has been shown in [1]-[3]
that $O(k \log(N/k))$ compressive measurements are sufficient to recover $\mathbf{x}$ when the measurements are random.
Signal recovery can be performed via optimization or greedy based approaches. A detailed overview of CS can be
found in BMI51. 

There are a variety of applications in which complete signal recovery is not necessary. The problem of sparsity 
pattern recovery (or the locations of non zero coefficients of the sparse signal) arises in a wide variety of areas 
including source localization [14], [15], sparse approximation [16], subset selection in linear regression [17], [18],
estimation of frequency band locations in cognitive radio networks [19]-[21], and signal denoising [22]. In these
applications, often finding the sparsity pattern of the signal is more important than approximating the signal itself. 
Further, in the CS framework, once the locations of the non zero coefficients are identified, the signal can be 
estimated using standard techniques. Performance limits on reliable recovery of the sparsity pattern of sparse signals 
with noisy compressive measurements have been derived by several authors in recent work exploiting information 
theoretic tools [23]-[30]. Most of these works focus on deriving necessary and sufficient conditions for reliable
sparsity pattern recovery, and quantify the gap between the performance limits of existing computationally tractable 
practical algorithms and what can be achieved based on the optimal Maximum Likelihood (ML) detector when the 
computational constraints are removed. 

Although initial work on CS focused on signals which are sparse in a certain basis, there are practical scenarios 
where structured properties of the signal are available. Reduced dimensional signal processing with several signal 
models which go beyond simple sparsity was discussed in recent literature [31]-[36]. One general model that can
describe many structured problems is that of a union of subspaces. In this setting, the signal is known to lie in one out
of a possible set of subspaces but the specific subspace chosen is unknown. Examples include wideband spectrum
sensing [20] and signals having finite rate of innovation [37], [38]. With this additional information, more efficient
algorithms can be developed compared to those for the traditional CS framework. Conditions under which stable
sampling and recovery are possible in a general union of subspaces model are derived in [32], [34]. Signal recovery
with a structured union of subspaces was considered in [31], [33]. However, the problem of subspace detection has
not been treated in this more general setting. 

In this paper, our goal is to investigate subspace detection from the union of subspaces model with a given 
sampling operator. We consider subspace recovery based on the optimal ML decoding scheme in the presence of 
noise and derive performance in terms of probability of error when sampling is performed via an arbitrary linear 
sampling operator. Based on an upper bound on the probability of error, we derive the minimum number of samples 
required for asymptotic reliable detection of subspaces in terms of a signal to noise (SNR) measure, the dimension 
of each subspace in the union and a term which quantifies the dependence among the subspaces. In the special case 
where sampling is performed via random projections and the subspaces in the union have a specific structure such 
that each subspace is a sum of some other $k_0$ subspaces, we obtain a more explicit expression for the minimum
number of measurements. This number depends on the number of underlying subspaces, the dimension of each
subspace, and the minimum nonzero block SNR (defined in Section IV.B). We note that the conventional sparsity
model is a special case of this structure. 



The asymptotic probability of error of the ML detector in the presence of noise for the standard sparsity model 
was first investigated in [23], followed by several other authors [24], [25], [30]. In [23], sufficient conditions were
derived on the number of noisy compressive measurements needed to achieve a vanishing probability of error
asymptotically for sparsity pattern recovery, while necessary conditions were considered in [25]. The analyses in
both papers are based on the assumption that the sampling operator is random. Here, we follow a similar direction
while deriving performance metrics for subspace detection with the union of subspaces model. However, there are
key differences between our derivations and those in [23]. First, we treat arbitrary (not necessarily random) sampling
operators and assume a general union of subspaces model as opposed to standard sparsity. Further, the results in
[23] were derived based on weak bounds on the probability of error, so there is a gap between those results and
the number of measurements required for the exact probability of error to vanish asymptotically at finite SNR. Here,
we consider tighter bounds on the probability of error; thus, even for the conventional CS framework, we present tighter
results. In particular, while the results of [23] require $k + (c_1 + 2048) \max\left\{ \log \binom{N-k}{k}, \frac{\log(N-k)}{\mathrm{CSNR}_{\min}} \right\}$ measurements,
we show that only $k + \frac{c_2}{\mathrm{CSNR}_{\min}} \log(N-k)$ measurements are needed for reliable asymptotic sparsity pattern
recovery with the standard sparsity model, where $N$, $k$, and $\mathrm{CSNR}_{\min}$ are the signal dimension, sparsity index, and the
minimum component SNR of the signal, respectively, and $c_1$ and $c_2$ are constants.

We further illustrate numerically the performance gap between the probability of error bound for the ML
detector and the probability of error of computationally tractable algorithms (e.g., OMP) used for subspace detection
with the union of subspaces model. Numerical results show that the derived bound on the probability of error is fairly
close to the exact probability of error of the ML detector obtained via simulations, especially in the high SNR
region.

The rest of the paper is organized as follows. In Section II, the problem of subspace detection from the union
of subspaces model is introduced. In Section III, performance limits with ML detection in terms of the probability
of error and conditions under which asymptotic reliable subspace detection in the presence of noise is guaranteed
are derived with a given linear sampling operator considering a general union of subspaces model. The results
are extended in Section IV to the setting where structured properties of the subspaces in the union are available.
Sufficient conditions for subspace recovery from the union of subspaces model when sampling is performed via
random projections are also derived in Section IV. In Section V, we discuss some practical algorithms to detect
subspaces in the union of subspaces model and present numerical results to validate the theoretical claims.

Throughout the paper, we use the following notation. Arbitrary vectors in a Hilbert space $\mathcal{H}$ are denoted by
lower case letters, e.g., $x$. Calligraphic letters, e.g., $\mathcal{S}$, are used to represent subspaces in $\mathcal{H}$. Vectors in $\mathbb{R}^N$ are
written in boldface lower case letters, e.g., $\mathbf{x}$. Scalars (in $\mathbb{R}$) are also denoted by lower case letters, e.g., $x$, when
there is no confusion. Matrices are written in boldface upper case letters, e.g., $\mathbf{A}$. Linear operators and a set of basis
vectors for a given subspace $\mathcal{S}$ are denoted by upper case letters, e.g., $A$. $\mathbf{x} \sim \mathcal{N}(\boldsymbol{\mu}, \boldsymbol{\Sigma})$ denotes that the random
vector $\mathbf{x}$ is distributed as multivariate Gaussian with mean $\boldsymbol{\mu}$ and covariance matrix $\boldsymbol{\Sigma}$. $x \sim \chi_m^2(\lambda)$ denotes that
the random variable $x$ is distributed as Chi squared with $m$ degrees of freedom and non-centrality parameter
$\lambda$. (The central Chi squared distribution is denoted by $\chi_m^2$.) $\mathbf{0}$ is a vector with appropriate dimension in which all
elements are zeros. The conjugate transpose of a matrix $\mathbf{A}$ is denoted by $\mathbf{A}^*$. $\mathbf{I}_k$ denotes the identity matrix of
size $k$. $\|\cdot\|_2$ denotes the $l_2$ norm and $|\cdot|$ is used to denote both the cardinality (of a set) and the absolute value (of a
scalar). Special functions used in the paper are listed below: Gaussian Q-function: $Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^\infty e^{-t^2/2}\, dt$; Gamma
function: $\Gamma(x) = \int_0^\infty t^{x-1} e^{-t}\, dt$; modified Bessel function with real arguments: $K_\nu(x) = \int_0^\infty e^{-x \cosh t} \cosh(\nu t)\, dt$.
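The three special functions listed above can be evaluated numerically. The following is a minimal sketch using only the Python standard library; the function names `q_function` and `bessel_k` and the truncation parameters `t_max` and `steps` are illustrative assumptions and do not come from the paper. The Bessel function is approximated by applying the trapezoidal rule directly to the integral definition stated in the notation paragraph.

```python
import math

def q_function(x):
    """Gaussian Q-function: Q(x) = (1/sqrt(2*pi)) * int_x^inf exp(-t^2/2) dt."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def bessel_k(nu, x, t_max=20.0, steps=20000):
    """Approximate K_nu(x) for x > 0 from its integral definition.

    Trapezoidal rule on [0, t_max]. The integrand decays double-exponentially,
    so truncating at t_max is assumed harmless for moderate nu and x.
    """
    h = t_max / steps
    total = 0.0
    for n in range(steps + 1):
        t = n * h
        weight = 0.5 if n in (0, steps) else 1.0
        total += weight * math.exp(-x * math.cosh(t)) * math.cosh(nu * t)
    return total * h

if __name__ == "__main__":
    print(q_function(0.0))            # 0.5 by symmetry
    print(math.gamma(0.5) ** 2)       # Gamma(1/2)^2, approximately pi
    print(bessel_k(0.5, 1.0))         # compare with sqrt(pi/2) * exp(-1)
```

The half-integer case $K_{1/2}(x) = \sqrt{\pi/(2x)}\, e^{-x}$ gives a convenient closed form against which the numerical integral can be checked.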

II. Problem Formulation and Mathematical Model 



As discussed in [31]-[34], there are many practical scenarios where the signals of interest lie in a union of
subspaces instead of a single subspace. 

II.A Union of subspaces 

Definition II.1. Union of subspaces: A signal $x \in \mathcal{H}$ lies in a union of subspaces if $x \in \mathcal{X}$ where $\mathcal{X}$ is defined as

$$\mathcal{X} = \bigcup_i \mathcal{S}_i \tag{1}$$

and the $\mathcal{S}_i$'s are subspaces of $\mathcal{H}$. A signal $x \in \mathcal{X}$ if and only if there exists $i_0$ such that $x \in \mathcal{S}_{i_0}$.

Let $T < \infty$ denote the number of subspaces in the union $\mathcal{X}$. Let $V_i = \{v_{im}\}_{m=0}^{k_i-1}$ be a basis for the finite dimensional
subspace $\mathcal{S}_i$ where $k_i$ is the dimension of $\mathcal{S}_i$. Then each $x \in \mathcal{S}_i$ can be expressed in terms of a basis expansion

$$x = \sum_{m=0}^{k_i-1} c_i(m)\, v_{im}$$

where $c_i(m)$ for $m = 0, 1, \cdots, k_i - 1$ are the coefficients corresponding to the basis $V_i$. We assume that the subspaces
are disjoint (i.e., there are no subspaces such that $\mathcal{S}_i \subset \mathcal{S}_j$ for $i \neq j$ in the union (1)) and each subspace $\mathcal{S}_i$ is
uniquely determined by the basis $V_i$.

II.B Observation model: Linear sampling

Consider a sampling operator via a bounded linear mapping of a signal $x$ that lives in an ambient Hilbert space
$\mathcal{H}$. Let the linear sampling operator $A$ be specified by a set of unique sampling vectors $\{a_m\}_{m=0}^{M-1}$. With this
notation, noisy samples are given by

$$\mathbf{y} = Ax + \mathbf{w} \tag{2}$$

where $\mathbf{y}$ is the $M \times 1$ measurement vector, and the $m$-th element of the vector $Ax$ is given by $(Ax)_m = \langle x, a_m \rangle$
for $m = 0, 1, \cdots, M-1$, where $\langle \cdot, \cdot \rangle$ denotes the inner product. The noise vector $\mathbf{w}$ is assumed to be Gaussian with
mean $\mathbf{0}$ and covariance matrix $\sigma_w^2 \mathbf{I}_M$.

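When $x \in \mathcal{S}_i$ for some $i$ in the model (1), the vector of samples can be equivalently represented in the form of
a matrix vector multiplication,

$$\mathbf{y} = \mathbf{B}_i \mathbf{c}_i + \mathbf{w} \tag{3}$$

where

$$\mathbf{B}_i = A V_i = \begin{pmatrix} \langle a_0, v_{i0} \rangle & \langle a_0, v_{i1} \rangle & \cdots & \langle a_0, v_{i(k_i-1)} \rangle \\ \langle a_1, v_{i0} \rangle & \langle a_1, v_{i1} \rangle & \cdots & \langle a_1, v_{i(k_i-1)} \rangle \\ \vdots & \vdots & & \vdots \\ \langle a_{M-1}, v_{i0} \rangle & \langle a_{M-1}, v_{i1} \rangle & \cdots & \langle a_{M-1}, v_{i(k_i-1)} \rangle \end{pmatrix}$$

and $\mathbf{c}_i = [c_i(0)\ c_i(1)\ \cdots\ c_i(k_i-1)]^T$ is the coefficient vector with respect to the basis $V_i$. Further, let $\mathbf{b}_{im}$ denote
the $m$-th column vector of the matrix $\mathbf{B}_i$ for $m = 0, 1, \cdots, k_i - 1$ and $i = 0, 1, \cdots, T-1$. We assume that the linear
sampling operator $A$ is a one-to-one mapping between $\mathcal{X}$ and $A\mathcal{X}$. Since $\{v_{i0}, \cdots, v_{i(k_i-1)}\}$ is a set of linearly
independent basis vectors, the vectors $\{\mathbf{b}_{i0}, \cdots, \mathbf{b}_{i(k_i-1)}\}$ are also linearly independent for each $i = 0, 1, \cdots, T-1$.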
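To make the model (2)-(3) concrete in the finite dimensional case $\mathcal{H} = \mathbb{R}^N$, the sketch below forms $\mathbf{B}_i = \mathbf{A}\mathbf{V}_i$ and draws noisy samples. The helper names (`sampled_basis`, `noisy_samples`), the Gaussian sampling matrix, and all numerical values are illustrative assumptions; the text itself only requires a linear operator that is one-to-one on the union.

```python
import numpy as np

def sampled_basis(A, V_i):
    """B_i = A V_i: entry (m, n) is the inner product of a_m and v_in (real case)."""
    return A @ V_i

def noisy_samples(B_i, c_i, sigma_w, rng):
    """Draw y = B_i c_i + w with w ~ N(0, sigma_w^2 I_M)."""
    w = sigma_w * rng.standard_normal(B_i.shape[0])
    return B_i @ c_i + w

rng = np.random.default_rng(0)
N, M, k = 16, 8, 3                       # hypothetical sizes
A = rng.standard_normal((M, N))          # rows play the role of the sampling vectors a_m
V_i = np.linalg.qr(rng.standard_normal((N, k)))[0]   # some basis of a k-dim subspace
B_i = sampled_basis(A, V_i)
c_i = np.array([1.0, -2.0, 0.5])
y = noisy_samples(B_i, c_i, sigma_w=0.1, rng=rng)

# Linear independence of the columns of B_i is what makes B_i^* B_i invertible.
print(B_i.shape, np.linalg.matrix_rank(B_i))
```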

The sparsity model used in the Compressive Sensing (CS) literature is a special case of this general union of
subspaces model. In the standard CS framework, it is assumed that a length-$N$ signal of interest $\mathbf{x}$ is $k$-sparse in an
orthonormal basis $\mathbf{V}$ so that $\mathbf{x}$ can be represented as $\mathbf{x} = \mathbf{V}\mathbf{c}$ with $\mathbf{c}$ having only $k \ll N$ significant coefficients.
This corresponds to assuming that $\mathbf{x}$ lies in a union of subspaces where each subspace is $k$-dimensional and has
as a basis $k$ vectors from the orthonormal basis $\mathbf{V}$. In this case, there are $T = \binom{N}{k}$ such subspaces in the union.
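As a small illustration of this correspondence, the following sketch enumerates the index sets that define the subspaces of the standard sparsity model and confirms the count $T = \binom{N}{k}$. The function name `sparse_supports` and the toy sizes are assumptions made for illustration only.

```python
from itertools import combinations
from math import comb

def sparse_supports(N, k):
    """All k-element index sets; each one selects k columns of V as a basis."""
    return list(combinations(range(N), k))

N, k = 6, 2                       # hypothetical small example
supports = sparse_supports(N, k)
print(len(supports), comb(N, k))  # both equal 15
```

Enumerating the supports explicitly is only feasible for small $N$ and $k$, since $T$ grows combinatorially.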

II.C Subspace detection from the union of subspaces model

As discussed in the Introduction, there are applications where it is sufficient to detect the subspace in which
the signal of interest lies from the union of subspaces model (1) instead of complete signal recovery. Moreover,
if there is a procedure to correctly identify the subspace with vanishing probability of error, then the signal $x$ can
be reconstructed with a small $l_2$ norm error using standard techniques. However, the converse may not always be
true, i.e., if an algorithm developed for complete signal recovery is used for subspace detection, it may not give an
equivalent performance guarantee. This is because, even if such an estimate of the signal is close to the true
signal with respect to the considered performance metric (e.g., $l_2$ norm error), the subspace in which the estimated
signal lies may be different from the true subspace. This can happen especially when the SNR is not sufficiently
high. Thus, investigating the problem of subspace detection is important and is the main focus of this paper.

For the problem of subspace detection, the performance metrics used to evaluate the quality of the estimate 
are different from those used for exact signal recovery. In this paper, we consider subspace detection via the ML 
detector and performance is evaluated via the probability of error which will be defined in the next section. More 
specifically, our goal is to address the following issues. 

• Performance of the optimal subspace detection scheme (ML detection) in terms of the probability of error in
detecting the subspaces from the union of subspaces model (1) in the presence of noise. We are also interested
in conditions under which reliable subspace recovery in the union is guaranteed with a given sampling operator.

• How much gain in terms of the number of samples required for subspace detection can be achieved if further
information on structures is available for the subspaces in (1) compared to the case when no additional
structured information is available (i.e., compared to the standard sparsity model used in CS).

• Illustration of the performance gap between the ML detector and computationally tractable algorithms for
subspace detection from the union of subspaces model at finite SNR.

II.D Main results

The main results of the paper can be summarized as follows. 

With the general union of subspaces model as defined in (1), the minimum number of samples required for
asymptotic reliable detection of subspaces in the presence of noise is

$$M > k + \frac{1}{f(\mathrm{SNR})} \log(T_0) \tag{4}$$

where $k$ is the dimension of each subspace, $f(\mathrm{SNR})$ is a monotonically increasing function of SNR and $T_0$ is a
measure of the number of subspaces in the union with maximum dependence where $T_0 < T$ (formal definitions of
all these terms are given in Section III). In the special case where each subspace in the union (1) can be expressed
as a sum of $k_0$ subspaces out of $L$, where each such subspace is $d$-dimensional such that $k = k_0 d$, the problem
of subspace detection reduces to the problem of block sparsity pattern recovery. In that case, when the sampling
operator is represented by random projections, the number of samples required for asymptotic reliable block sparsity
pattern recovery is given by

where $\mathrm{BSNR}_{\min}$ is the minimum nonzero block SNR and $c_2$ is a constant.

For the general union of subspaces model, the authors in [34] derived lower bounds on the minimum number of
measurements required for the sampling matrix to satisfy the restricted isometry property (RIP). Similar results are
derived in [33] when the subspaces in the union have a specific structure leading to block sparsity as considered
in Section IV. Based on their results, the dominant factor of the minimum number of samples required for the
sampling matrix to satisfy RIP scales as

$$M > \eta_1 k + \eta_2 \log(T) \tag{6}$$

with the general union of subspaces model [34] and

$$\eta_3 k + \eta_4 k \log(L/k) \tag{7}$$

with the block sparse model [33] for some constants $\eta_i$, $i = 1, 2, 3, 4$. Thus, with these numbers of measurements,
signals which lie in the union of subspaces can be recovered using practical algorithms with high probability when
there is no noise. In contrast, our results provide sufficient conditions for subspace detection (but not complete
signal recovery) with the optimal ML detector. Our results show that, when the subspaces in the union are such
that $T_0 \ll T$ and we are operating in a finite SNR region, the minimum number of samples required for asymptotic
subspace detection is much less than that predicted in [34] for signal recovery.



In the case of the standard sparsity model, our results show that 

$$M > k + \frac{c_2}{\mathrm{CSNR}_{\min}} \log(N-k) \tag{8}$$

measurements are required for reliable sparsity pattern recovery where N = Ld and CSNR m \ n {< ^ ° ) is 
the minimum component SNR. Thus, from (5) and (8), we observe that the number of measurements required for
asymptotic subspace recovery beyond the sparsity index (i.e., $M - k$) reduces approximately $d$ times with a block
sparsity model compared to what is required with standard sparsity pattern recovery. A detailed comparison between
our results and existing results in the literature is presented in Sections III and IV.

From the numerical results, we will see that the derived upper bound on the probability of error of the ML 
detector is a tight bound for the exact probability of error obtained via simulations. Further, it is observed that 
existing computationally tractable algorithms for subspace detection show a considerable performance gap compared 
to what can be achieved via computationally expensive ML detection. 

III. Subspace Detection With General Unions 

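We assume that the knowledge of the sampling operator $A$ and the bases for each subspace is available at the
detector. The problem of finding the true subspace from the observation model (3) via the ML detector becomes
finding the index $\hat{i}$ such that

$$\hat{i} = \arg\max_{j = 0, \cdots, T-1} p(\mathbf{y} | \mathbf{B}_j).$$

Given that $x \in \mathcal{S}_i$ for some $i$, and using the observation model (3), we have $p(\mathbf{y}|\mathbf{B}_i) = \mathcal{N}(\mathbf{B}_i \mathbf{c}_i, \sigma_w^2 \mathbf{I}_M)$. Since the
coefficient vector $\mathbf{c}_i$ is not known, the ML detector estimates $\mathbf{c}_i$ such that $p(\mathbf{y}|\mathbf{B}_i)$ is maximized. The ML detector
chooses the subspace $\mathcal{S}_i$ over $\mathcal{S}_j$ for $i \neq j$ if

$$\max_{\mathbf{c}_i} p(\mathbf{y}|\mathbf{B}_i) > \max_{\mathbf{c}_j} p(\mathbf{y}|\mathbf{B}_j).$$

The ML estimator of $\mathbf{c}_i$ can be found as $\hat{\mathbf{c}}_i = (\mathbf{B}_i^* \mathbf{B}_i)^{-1} \mathbf{B}_i^* \mathbf{y}$, resulting in

$$\log\left(\max_{\mathbf{c}_i} p(\mathbf{y}|\mathbf{B}_i)\right) = \log\left(\frac{1}{(2\pi\sigma_w^2)^{M/2}}\right) - \frac{1}{2\sigma_w^2} \|\mathbf{y} - \mathbf{P}_i \mathbf{y}\|_2^2 = \log\left(\frac{1}{(2\pi\sigma_w^2)^{M/2}}\right) - \frac{1}{2\sigma_w^2} \|\mathbf{P}_i^\perp \mathbf{y}\|_2^2$$

where $\mathbf{P}_i = \mathbf{B}_i(\mathbf{B}_i^* \mathbf{B}_i)^{-1} \mathbf{B}_i^*$ is the orthogonal projector onto the span of $\{\mathbf{b}_{im}\}_{m=0}^{k_i-1}$, $\mathbf{P}_i^\perp = \mathbf{I} - \mathbf{P}_i$ and $\mathbf{B}^*$ denotes
the Hermitian transpose of the matrix $\mathbf{B}$. Thus, the ML detector selects the subspace $\mathcal{S}_i$ over $\mathcal{S}_j$ for $i \neq j$ if

$$\|\mathbf{P}_i^\perp \mathbf{y}\|_2^2 - \|\mathbf{P}_j^\perp \mathbf{y}\|_2^2 < 0.$$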
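The decision rule above amounts to picking the subspace whose orthogonal-complement projection of $\mathbf{y}$ has the least energy. A minimal sketch of this exhaustive search is given below for the real-valued standard sparsity case with $\mathbf{V} = \mathbf{I}$; the function names, the Gaussian sampling matrix, and the toy sizes are assumptions made for illustration, and the residual is computed through a least-squares solve rather than by forming the projector explicitly.

```python
import numpy as np
from itertools import combinations

def residual_energy(B, y):
    """||P_perp y||_2^2, computed via the least-squares estimate of c."""
    c_hat = np.linalg.lstsq(B, y, rcond=None)[0]
    r = y - B @ c_hat
    return float(r @ r)

def ml_detect(B_list, y):
    """Index of the subspace with the smallest residual energy."""
    return int(np.argmin([residual_energy(B, y) for B in B_list]))

rng = np.random.default_rng(1)
N, M, k = 8, 6, 2                                # hypothetical sizes
A = rng.standard_normal((M, N))
supports = list(combinations(range(N), k))       # standard sparsity, V = I
B_list = [A[:, list(s)] for s in supports]

true_index = 5
y_clean = B_list[true_index] @ np.array([1.5, -1.0])
detected = ml_detect(B_list, y_clean)            # noise-free sanity check
print(detected == true_index)
```

The search visits all $T$ subspaces, which is why the ML detector is treated in the text as a performance benchmark rather than a practical algorithm.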
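The performance of the ML detector is characterized by the probability of error, which is given by

$$P_e = \Pr(\mathbf{B}_{\text{estimated}} \neq \mathbf{B}_{\text{true}}) = \sum_j \Pr(\hat{i} \neq j | \mathbf{B}_j) \Pr(\mathbf{B}_j) \le \sum_j \sum_{i \neq j} \Pr(\hat{i} = i | \mathbf{B} = \mathbf{B}_j) \Pr(\mathbf{B} = \mathbf{B}_j) \tag{9}$$

where

$$\Pr(\hat{i} = i | \mathbf{B} = \mathbf{B}_j) = \Pr(\|\mathbf{P}_i^\perp \mathbf{y}\|_2^2 - \|\mathbf{P}_j^\perp \mathbf{y}\|_2^2 < 0).$$

Let $\Delta(\mathbf{y}) = \|\mathbf{P}_i^\perp \mathbf{y}\|_2^2 - \|\mathbf{P}_j^\perp \mathbf{y}\|_2^2$. When the true subspace is $\mathcal{S}_j$, we have $\|\mathbf{P}_j^\perp \mathbf{y}\|_2^2 = \|\mathbf{P}_j^\perp \mathbf{w}\|_2^2$ and

$$\mathbf{P}_i^\perp \mathbf{y} = \mathbf{P}_i^\perp \mathbf{B}_j \mathbf{c}_j + \mathbf{P}_i^\perp \mathbf{w} = \mathbf{P}_i^\perp (\mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} + \mathbf{w}) \tag{10}$$

where $\mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} = \sum_{\mathbf{b}_{jm} \notin \mathcal{R}(\mathbf{B}_i)} \mathbf{b}_{jm} c_j(m)$ and $\mathcal{R}(\mathbf{A})$ denotes the range space of the matrix $\mathbf{A}$. More specifically, the
$M \times l$ matrix $\mathbf{B}_{j \setminus i}$ contains the columns of $\mathbf{B}_j$ which are not in the range space of the matrix $\mathbf{B}_i$, where $l$ is the
number of such columns in $\mathbf{B}_j$. The $l \times 1$ vector $\mathbf{c}_{j \setminus i}$ contains the elements of $\mathbf{c}_j$ corresponding to the column
vectors in $\mathbf{B}_{j \setminus i}$.

We conclude that $\Delta(\mathbf{y}) = \|\mathbf{P}_i^\perp (\mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} + \mathbf{w})\|_2^2 - \|\mathbf{P}_j^\perp \mathbf{w}\|_2^2$ and

$$\Pr(\Delta(\mathbf{y}) < 0) = \Pr\left( \frac{\|\mathbf{P}_i^\perp (\mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} + \mathbf{w})\|_2^2}{\|\mathbf{P}_j^\perp \mathbf{w}\|_2^2} < 1 \right).$$

When $\mathbf{B}_j$ is given, the random variable $g_1 = \|\mathbf{P}_i^\perp (\mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} + \mathbf{w})\|_2^2 / \sigma_w^2$ is a non-central Chi squared random
variable with $M - k_i$ degrees of freedom and non-centrality parameter $\|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2 / \sigma_w^2$. The random variable
$g_2 = \|\mathbf{P}_j^\perp \mathbf{w}\|_2^2 / \sigma_w^2$ is a (central) Chi squared random variable with $M - k_j$ degrees of freedom. The two random
variables $g_1$ and $g_2$ are, in general, correlated and the computation of the exact value of $\Pr(\Delta(\mathbf{y}) < 0)$ is difficult.
In the following, we find an upper bound for the quantity $\Pr(\Delta(\mathbf{y}) < 0)$ following techniques similar to those
proposed in [23].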

III.A Upper bound on Pr (A(y) < 0) 

We assume $k_i = k$ for $i = 0, 1, \cdots, T-1$. For clarity, we further introduce the following notation. Let
$\mathcal{W}_{j \setminus i}$ be the set consisting of column indices of $\mathbf{B}_j$ such that $\mathbf{b}_{jm} \notin \mathcal{R}(\mathbf{B}_i)$ for $m = 0, 1, \cdots, k-1$ and $i \neq j$
($i, j = 0, 1, \cdots, T-1$). We then have $|\mathcal{W}_{j \setminus i}| = l$ where $l$ can take values from $1, 2, \cdots, k$.

Lemma III.l. Assume that the sampling operator A is known. Given that the true subspace is Sj, the probability 
of error in selecting the subspace Si over Sj, Pr (A(y) < 0), is upper bounded by, 

$$\Pr(\Delta(\mathbf{y}) < 0) \le Q\left(\tfrac{1}{2}(1 - 2c_0)\sqrt{\lambda_{j \setminus i}}\right) + \Psi(l, \lambda_{j \setminus i}) \tag{11}$$

where Xj\i = ^-||P- L Bj\ i Cj\j||| ^(l,\j\i) = wr^) (^J\i) l/2 ~ 1/2K l/2-i/2 ( £2 r li )> Q(-) « the Gaussian Q 
function, $\Gamma(\cdot)$ is the Gamma function, $K_\nu(x)$ is the modified Bessel function, and $0 < c_0 < \frac{1}{2}$.

Proof: See Appendix A. ■ 

Theorem 1. When the sampling operator is given, the average probability of error of the ML detector for subspace
detection over the general union of subspaces (1) with $T$ subspaces is upper bounded by

$$P_e \le \frac{1}{T} \sum_{j=0}^{T-1} \sum_{i \neq j} \left[ Q\left(\tfrac{1}{2}(1 - 2c_0)\sqrt{\lambda_{j \setminus i}}\right) + \Psi(l, \lambda_{j \setminus i}) \right] \tag{12}$$

assuming that the probability of each subspace in the union (1) is uniform, where $\lambda_{j \setminus i}$, $c_0$, $Q(\cdot)$, $\Psi(\cdot, \cdot)$ are as
defined in Lemma III.1.

Proof: The proof follows from Lemma III.1 and (9). ■

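III.B Evaluation of $\lambda_{j \setminus i}$

From Lemma III.1, the terms $\lambda_{j \setminus i}$ and $l$ can be considered as measures which determine how close the subspace
$\mathcal{S}_i$ is to subspace $\mathcal{S}_j$ when the true subspace is $\mathcal{S}_j$, for $i \neq j$ and $i, j = 0, 1, \cdots, T-1$. We assume that the
subspaces $\mathcal{S}_i$ and $\mathcal{S}_j$ are not necessarily linearly independent; i.e., there can be elements in $\mathcal{S}_j$ which are also in
$\mathcal{S}_i$. Let $\mathcal{W}_{j \setminus i}$ contain the column indices of $\mathbf{B}_j$ which are not in $\mathcal{R}(\mathbf{B}_i)$ and $|\mathcal{W}_{j \setminus i}| = l$ for any $i \neq j$, where $l$ takes
values from $1, 2, \cdots, k$. As $l$ increases, the dependency of the two subspaces decreases, resulting in more separable
subspaces. In the special case where the subspaces $\mathcal{S}_j$ and $\mathcal{S}_i$ are linearly independent, we have $l = k$. Thus, $l$
can be considered as a measure of dependence between any two subspaces $\mathcal{S}_j$ and $\mathcal{S}_i$ for $i \neq j$ in the union (1).
For given $l$, it can be seen that the probability $\Pr(\Delta(\mathbf{y}) < 0)$ in (11) monotonically decreases as $\lambda_{j \setminus i}$ increases,
which implies that the larger the value of $\lambda_{j \setminus i}$, the better the separation of the two subspaces $\mathcal{S}_j$ and $\mathcal{S}_i$ for $j \neq i$. It
is, therefore, of interest to further investigate the quantity $\lambda_{j \setminus i}$.

As defined in Lemma III.1, $\lambda_{j \setminus i}$ is given by

$$\lambda_{j \setminus i} = \frac{1}{\sigma_w^2} \|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2.$$

If the true subspace is assumed to be $\mathcal{S}_j$, then the quantity $\|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2$ ($= \|\mathbf{P}_i^\perp \mathbf{B}_j \mathbf{c}_j\|_2^2 = \|\mathbf{P}_i^\perp A x\|_2^2$) denotes
the energy of the sampled signal $Ax$ projected onto the null space of $\mathbf{B}_i^*$; i.e., the energy of the sampled signal which
is unaccounted for by $\mathcal{S}_i$ for $i \neq j$. Therefore, when $\|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2$ is large, the probability that the subspace $\mathcal{S}_i$
is selected as the true subspace becomes small. Further, for $\|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2$ to be zero, we would need $\mathcal{S}_j \subset \mathcal{S}_i$,
which cannot be true based on our assumptions. Thus, $\lambda_{j \setminus i} > 0$.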
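The quantities $l$ and $\lambda_{j \setminus i}$ discussed in this subsection can be computed directly for a given pair of sampled bases. The sketch below does so for a deliberately simple real-valued toy case; the function names, the rank-based membership test with tolerance `tol`, and the choice $\mathbf{A} = \mathbf{I}_4$ are assumptions made for illustration.

```python
import numpy as np

def outside_range_indices(B_j, B_i, tol=1e-9):
    """Column indices m of B_j with b_jm not in the range space of B_i."""
    rank_i = np.linalg.matrix_rank(B_i, tol=tol)
    return [m for m in range(B_j.shape[1])
            if np.linalg.matrix_rank(np.column_stack([B_i, B_j[:, m]]), tol=tol) > rank_i]

def lambda_j_minus_i(B_j, c_j, B_i, sigma_w):
    """Return (lambda, l) for true subspace j tested against subspace i."""
    W = outside_range_indices(B_j, B_i)
    v = B_j[:, W] @ c_j[W]                 # B_{j\i} c_{j\i}
    P_i = B_i @ np.linalg.pinv(B_i)        # orthogonal projector onto range(B_i)
    r = v - P_i @ v                        # P_i^perp applied to v
    return float(r @ r) / sigma_w**2, len(W)

# Toy example with A = I_4: S_i = span{e0, e1}, S_j = span{e1, e2}.
E = np.eye(4)
B_i, B_j = E[:, [0, 1]], E[:, [1, 2]]
lam, l = lambda_j_minus_i(B_j, np.array([3.0, 2.0]), B_i, sigma_w=1.0)
print(lam, l)
```

In this toy case only the component along $e_2$ is unaccounted for by $\mathcal{S}_i$, so $l = 1$ and $\lambda_{j \setminus i}$ equals the squared coefficient on $e_2$ divided by $\sigma_w^2$.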

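Let the eigendecomposition of $\mathbf{P}_i^\perp$ be $\mathbf{P}_i^\perp = \mathbf{Q}_i \boldsymbol{\Lambda}_i \mathbf{Q}_i^*$ where $\mathbf{Q}_i$ is a unitary matrix consisting of eigenvectors
of $\mathbf{P}_i^\perp$ and $\boldsymbol{\Lambda}_i$ is a diagonal matrix in which the diagonal elements represent eigenvalues of $\mathbf{P}_i^\perp$, which are $M - k$
ones and $k$ zeros. Then, for given $l$,

$$\lambda_{j \setminus i} = \frac{1}{\sigma_w^2} \|\mathbf{P}_i^\perp \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i}\|_2^2 = \sum_{m \in \mathcal{Q}_i} \alpha_{m,i}^2(l) \ge (M - k)\, \alpha_{\min,l}^2$$

where $\alpha_{m,i}(l) = \frac{1}{\sigma_w} \langle \mathbf{q}_{m,i}, \mathbf{B}_{j \setminus i} \mathbf{c}_{j \setminus i} \rangle$ for given $l$, $\mathbf{q}_{m,i}$ is the $m$-th eigenvector of $\mathbf{P}_i^\perp$, $\mathcal{Q}_i$ is the set containing indices
corresponding to nonzero eigenvalues where $|\mathcal{Q}_i| = M - k$, and $\alpha_{\min,l} = \min |\alpha_{m,i}(l)|$.

Note that $(M - k)\alpha_{\min,l}^2$ measures the minimum SNR of the sampled signal, $Ax$, projected onto the null space
of any subspace $\mathcal{S}_i$ for $i \neq j$, $i = 0, 1, \cdots, T-1$ such that $|\mathcal{W}_{j \setminus i}| = l$, given that the true subspace in which the
signal lies is $\mathcal{S}_j$.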
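The eigenvalue structure of $\mathbf{P}_i^\perp$ and the resulting lower bound can be checked numerically. The following sketch is a real-valued illustration under assumed sizes; the random matrix `B_i`, the vector `v` standing in for $\mathbf{B}_{j \setminus i}\mathbf{c}_{j \setminus i}$, and the minimum taken over a single pair $(i, j)$ only are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(2)
M, k, sigma_w = 10, 3, 0.5                 # hypothetical sizes
B_i = rng.standard_normal((M, k))
P_perp = np.eye(M) - B_i @ np.linalg.pinv(B_i)

eigvals, Q = np.linalg.eigh(P_perp)        # P_perp is symmetric in the real case
ones = int(np.sum(np.isclose(eigvals, 1.0)))
zeros = int(np.sum(np.isclose(eigvals, 0.0)))
print(ones, zeros)                         # M - k ones and k zeros

v = rng.standard_normal(M)                 # stands in for B_{j\i} c_{j\i}
lam = float(np.sum((P_perp @ v) ** 2)) / sigma_w**2
alphas = (Q[:, np.isclose(eigvals, 1.0)].T @ v) / sigma_w
print(np.isclose(lam, np.sum(alphas ** 2)),
      lam >= (M - k) * np.min(np.abs(alphas)) ** 2)
```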

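For a given subspace $\mathcal{S}_j$, define $T_j(l)$ to be the number of subspaces $\mathcal{S}_i$ such that $|\mathcal{W}_{j \setminus i}| = l$. With this notation,
the probability of error in (12) can be further upper bounded by

$$P_e \le \frac{1}{T} \sum_{j=0}^{T-1} \sum_{l=1}^{k} T_j(l) \left[ Q\left(\tfrac{1}{2}(1 - 2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right) + \Psi\left(l, (M-k)\alpha_{\min,l}^2\right) \right] \tag{13}$$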



where tf (/, (M - fe)< ia J = ^y(co(M - k)a 2 min ^ 2 - l / 2 K l/2 ^ /2 (c (M - k)a 2 mi ^/2). To obtain <E]> we 
used the facts that $Q(x)$ is monotonically non-increasing in $x$ and $\Psi(s, x)$ is monotonically non-increasing in $x$
for given $s$ when $x > 0$. The quantity $T_j(l)$ is a measure of the dependency among the subspaces in the union. To
compute $T_j(l)$ explicitly, the specific structures of the subspaces should be known. For example, in the standard
sparsity model used in CS in which the union in (1) consists of $T = \binom{N}{k}$ subspaces from an orthonormal basis $\mathbf{V}$
of dimension $N$, there are $\binom{k}{l}\binom{N-k}{l}$ sets such that $|\mathcal{W}_{j \setminus i}| = l$, thus $T_j(l) = \binom{k}{l}\binom{N-k}{l}$. In that particular
case $T_j(l)$ is the same for all $j = 0, 1, \cdots, T-1$. To further upper bound the probability of error of the ML detector
in (13) with an arbitrary union of subspaces model, we let $T_0(l) = \max_{j = 0, 1, \cdots, T-1} T_j(l)$. Then (13) is upper bounded
by

$$P_e \le \sum_{l=1}^{k} T_0(l) \left[ Q\left(\tfrac{1}{2}(1 - 2c_0)\sqrt{(M-k)\alpha_{\min,l}^2}\right) + \Psi\left(l, (M-k)\alpha_{\min,l}^2\right) \right]. \tag{14}$$
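The count $T_j(l) = \binom{k}{l}\binom{N-k}{l}$ quoted for the standard sparsity model can be verified by brute force for small sizes. The sketch below is illustrative only; the function name `count_by_overlap`, the chosen true support, and the toy values of $N$ and $k$ are assumptions.

```python
from itertools import combinations
from math import comb

def count_by_overlap(N, k, true_support):
    """Brute-force T_j(l): number of other supports missing exactly l true indices."""
    true_set = set(true_support)
    counts = {l: 0 for l in range(1, k + 1)}
    for other in combinations(range(N), k):
        l = len(true_set - set(other))
        if l > 0:
            counts[l] += 1
    return counts

N, k = 7, 3
counts = count_by_overlap(N, k, true_support=(0, 2, 5))
formula = {l: comb(k, l) * comb(N - k, l) for l in range(1, k + 1)}
print(counts == formula, sum(counts.values()) == comb(N, k) - 1)
```

Summing $T_j(l)$ over $l$ recovers the $T - 1$ competing subspaces, which is consistent with replacing the double sum in (12) by the sum over $l$ in (13).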

Theorem 2. Let $T_0(l)$ and $\alpha_{\min,l}$ be as defined in Subsection III.B. Suppose that the sampling is performed via
any linear sampling operator $A$. Then we have $\lim_{(M-k) \to \infty} P_e = 0$ if the following condition is satisfied:

$$M > k + \max\{M_1, M_2\}$$

where M x = ; max^ [h{l) = (1 _ 2co ^ aLn[ {Iog(T (0) + log(l/2)}}, 

M 2 = j= maxJ/ 2 (l) = 2 ^+?^ 1} {log(T (0) + log (^) } }, < c < 1/2, 6 = ^ W r > 0. 

Proof: See Appendix B. ■ 

Let £j G {1, • • • , A;} be the value of Z which maximizes /j(Z) as defined in Theorem |2] for i = 1, 2. For M 2 , it 

can be verified that we can find constants en and rn in the defined regimes such that -n — %—-vz > if k is 

u u & (1 — 2c ) 2 r a c 

fairly small. Then the dominant factor of $M_1$ and $M_2$ can be written in the form $\frac{c_1}{\alpha_{\min}^2} \log(T_0)$, where $\alpha_{\min}^2$ and
$T_0$ are the corresponding values of $\alpha_{\min,l}^2$ and $T_0(l)$ when $l = l_0$ for $l_0 \in \{l_1, l_2\}$ and $c_1$ is an appropriate constant.
Since most of the scenarios we are interested in are for the case where $k$ is sufficiently small, the minimum
number of samples required for reliable subspace detection is on the order of $k + \frac{c_1}{\alpha_{\min}^2} \log(T_0)$. It is further
noted that $T_0(l) < T$ for all $l$ and thus $T_0 < T$, where $T$ is the total number of subspaces in the union (1).

In the following, we compare the results in Theorem 2 with some existing results for the general union of subspaces
model (1). It has been shown in [34] that the following condition should be satisfied by the sampling matrix to
guarantee reliable recovery of the sparse signal based on the observation model (3):



Theorem 3 ([34]). For any given $t > 0$, if

M>^(log(2T) + klog(^p\+t) (15) 

then the matrix $\mathbf{B}$ satisfies the restricted isometry property (RIP) with the restricted isometry constant $\delta$ (for a formal
definition of RIP, readers may refer to [34]).

From the right hand side of (15), it can be seen that the dominant factor is on the order of $\eta_1 k + \eta_2 \log(T)$ for
some constants $\eta_1$ and $\eta_2$ and thus, with that many samples, the signal $x$ can be recovered using a practical algorithm.



However, in our work, which specifically focuses on subspace recovery (not signal recovery) in the presence of
noise, we showed that the minimum number of measurements required for reliable subspace recovery with the ML
detector scales as $k + \frac{1}{f(\mathrm{SNR})} \log(T_0)$ where $T_0 < T$ and $f(\mathrm{SNR})$ is a monotonically increasing function of SNR.
If the subspaces in the union are such that $T_0 \ll T$, then subspace recovery from the union
of subspaces model requires far fewer measurements (with the ML detector) at a given SNR than
predicted in [34]. For example, when the signal of interest follows the standard sparsity model with $T = \binom{N}{k}$ where
$N$ is the signal dimension, $T_0$ can be written in the form $\binom{k}{l_0}\binom{N-k}{l_0}$ where $l_0$ can take a value from $1, 2, \cdots, k$.
Thus, in this case, $\log T_0$ is considerably smaller than $\log T$, especially when $k$ is not very small. In
the worst case where $T_0 \approx T$, the ML detector in the presence of noise requires the same order of measurements
as predicted in [34]. However, the exact number of measurements greatly depends
on the measures related to SNR and other constants.
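For the standard sparsity example, the two logarithms can be tabulated directly. The sketch below takes the largest $T_0(l)$ over all $l$, which is an assumption made here for illustration (the relevant $l_0$ in the text is the maximizer of the functions in Theorem 2 and may give a smaller value); the function name and the $(N, k)$ pairs are likewise illustrative. It prints both quantities so that the size of the gap can be inspected for a given configuration.

```python
from math import comb, log

def log_T0_and_log_T(N, k):
    """Return (max over l of log T_0(l), log T) for the standard sparsity model."""
    T0 = max(comb(k, l) * comb(N - k, l) for l in range(1, k + 1))
    return log(T0), log(comb(N, k))

for N, k in [(100, 5), (100, 20), (1000, 20)]:
    log_T0, log_T = log_T0_and_log_T(N, k)
    print(N, k, round(log_T0, 2), round(log_T, 2))
```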

III.C Random sampling 

Next, we consider the special case where the sampling operator is an $M \times N$ matrix in which the elements are
realizations of a random variable (e.g., Gaussian). Then we have $\mathbf{B}_i = \mathbf{A}\mathbf{V}_i$ in (3), where $\mathbf{A}$ is the sampling matrix and
$\mathbf{V}_i = [\mathbf{v}_{i0} | \cdots | \mathbf{v}_{i(k-1)}]$ is the $N \times k$ matrix whose columns are the basis vectors of the subspace $\mathcal{S}_i$ for
$i = 0, 1, \cdots, T-1$. The only term which depends on the sampling operator in the expression for the upper bound
on the probability of error in (12) is $\lambda_{j \setminus i}$. When the sampling operator is a random projection matrix, $\lambda_{j \setminus i}$ with
the general union of subspaces model (1) can be evaluated as shown in Proposition III.1.

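Proposition III.1. Consider that the sampling matrix $\mathbf{A}$ consists of elements drawn from a Gaussian ensemble
with mean zero and variance 1. When $M - k$ is sufficiently large, we may approximate $\lambda_{j \setminus i}$ as

$$\lambda_{j \setminus i} \approx \frac{1}{\sigma_w^2} (M - k) \Big\| \sum_{m \in \mathcal{W}_{j \setminus i}} \mathbf{v}_{jm} c_j(m) \Big\|_2^2$$

where, as defined before, $\mathcal{W}_{j \setminus i}$ ($l = |\mathcal{W}_{j \setminus i}|$) denotes the set consisting of indices of basis vectors in $\mathcal{S}_j$ which are
not in $\mathcal{S}_i$.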


Proof: We rewrite $\lambda_{j\setminus i} = \frac{1}{\sigma_w^2}\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2$. The $t$-th element of the vector $B_{j\setminus i} c_{j\setminus i}$ can be written as
$\langle a_t, \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \rangle$ where the $a_t$'s are the row vectors of $A$ for $t = 0, 1, \cdots, M-1$. Assuming that the elements
of $A$ are independent Gaussian with mean zero and variance 1, it can be easily seen that $\langle a_t, \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \rangle$
is a realization of a Gaussian random variable with mean zero and variance $\| \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \|_2^2$. Further, the
elements of $B_{j\setminus i} c_{j\setminus i}$ are independent of each other since the $a_t$'s are independent for $t = 0, 1, \cdots, M-1$. Thus, the
random vector $B_{j\setminus i} c_{j\setminus i} \sim \mathcal{N}(0, \| \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \|_2^2 I_M)$. With given realizations, consider again the transformation
$Q_i^T B_{j\setminus i} c_{j\setminus i}$ where $Q_i$ is the unitary matrix with eigenvectors of $P_i^\perp$. Since the elements in $B_{j\setminus i} c_{j\setminus i}$ are independent
and identically distributed (iid), the unitary transformation does not change the distribution of $B_{j\setminus i} c_{j\setminus i}$. Then
$\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2 = \|\Lambda_i Q_i^T B_{j\setminus i} c_{j\setminus i}\|_2^2$ is a sum of $M - k$ iid random variables. Thus, when $(M - k)$ is sufficiently
large, invoking the law of large numbers, we may approximate

$$\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2 \to (M - k) \Big\| \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \Big\|_2^2$$

which completes the proof. ■
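The approximation in Proposition III.1 can be checked numerically. The following is a minimal sketch, not the authors' simulation code: it assumes a canonical basis (so that the columns of $B_i$ are simply $k$ columns of $A$), takes $l = 1$, and uses hypothetical helper names (`project_out`, `orthonormalize`, `mean_ratio`). It averages $\|P_i^\perp A u\|_2^2 / ((M-k)\|u\|_2^2)$ over independent Gaussian draws of $A$, which should be close to 1 when $M - k$ is large.

```python
import math
import random


def project_out(vec, orthonormal_basis):
    """Remove from vec its components along each orthonormal basis vector."""
    out = list(vec)
    for q in orthonormal_basis:
        dot = sum(a * b for a, b in zip(out, q))
        out = [a - dot * b for a, b in zip(out, q)]
    return out


def orthonormalize(columns):
    """Modified Gram-Schmidt on a list of column vectors."""
    basis = []
    for col in columns:
        v = project_out(col, basis)
        norm = math.sqrt(sum(a * a for a in v))
        basis.append([a / norm for a in v])
    return basis


def mean_ratio(M, k, coeff, trials, seed=0):
    """Average of ||P_i_perp A u||^2 / ((M - k) ||u||^2) over random A."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        # Columns of A that multiply the k basis vectors of S_i
        # (assumption: canonical basis, so B_i is k columns of A).
        b_i = [[rng.gauss(0.0, 1.0) for _ in range(M)] for _ in range(k)]
        # A u, where u = coeff * e_n lies outside S_i (l = 1),
        # so A u is coeff times another, independent column of A.
        a_u = [coeff * rng.gauss(0.0, 1.0) for _ in range(M)]
        resid = project_out(a_u, orthonormalize(b_i))
        total += sum(r * r for r in resid) / ((M - k) * coeff ** 2)
    return total / trials


ratio = mean_ratio(M=200, k=3, coeff=1.7, trials=50)
print(f"mean ratio = {ratio:.3f}")
```

The ratio fluctuates around 1 with a spread that shrinks as $M - k$ grows, which is the law-of-large-numbers step used in the proof.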

It is noted that the quantity $\sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m)$ is the portion of the original signal $x$ that is unaccounted for by the
subspace $\mathcal{S}_i$ when the true subspace is $\mathcal{S}_j$ for $j \neq i$. Let $\bar{\alpha}_{\min,l}^2 = \frac{1}{\sigma_w^2} \min \| \sum_{m \in \mathcal{W}_{j\setminus i}} v_{jm} c_j(m) \|_2^2$ be the minimum
(over $i, j = 0, 1, \cdots, T-1$) SNR of the portion of the original signal $x$ which is unaccounted for by the subspace $\mathcal{S}_i$ when the
true subspace is $\mathcal{S}_j$ such that $|\mathcal{W}_{j\setminus i}| = l$ for $j \neq i$. Then, with random sampling, the upper bound on the probability
of error of the ML detector in (12) reduces to (14) after replacing $\alpha_{\min,l}^2$ in (14) by $\bar{\alpha}_{\min,l}^2$. It is worth mentioning
that $\alpha_{\min,l}^2$ in (14) is a measure of SNR after sampling while $\bar{\alpha}_{\min,l}^2$ is a measure of SNR before sampling the
signal.

IV. Subspace Detection from Structured Union of Subspaces

Although the general union of subspaces model in (1) is applicable to many applications, there are certain
scenarios in which the signals can be assumed to lie in a more structured union of subspaces, as considered in [33].




IV.A Block sparsity 

Next, we consider the case where each subspace in the union (1) has additional structure as considered in
[33], [39]. Under this structure, each subspace is represented as a sum of $k_0$ (out of $L$) disjoint subspaces. More
specifically,

$$\mathcal{S}_i = \bigoplus_{j \in \Sigma_{k_0}} \mathcal{V}_j \qquad (16)$$

where $\{\mathcal{V}_j\}_{j=0}^{L-1}$ are disjoint subspaces, and $\Sigma_{k_0}$ contains $k_0$ indices from $\{0, 1, \cdots, L-1\}$. Let $d_j = \dim(\mathcal{V}_j)$
and $N = \sum_{j=0}^{L-1} d_j$. Then there are $T = \binom{L}{k_0}$ subspaces in the union. Under this formulation, the dimension of each
subspace in the union (1) is $k = \sum_{j \in \Sigma_{k_0}} d_j$. In the special case where $d_j = d$ for all $j$, $k = k_0 d$.

Defining $V_j$ as a basis for $\mathcal{V}_j$, a signal in the union can be written as

$$x = \sum_{j \in \Sigma_{k_0}} V_j c_j \qquad (17)$$

where $c_j = [c_j(0), \cdots, c_j(d_j - 1)]^T \in \mathbb{R}^{d_j}$ is a $d_j \times 1$ coefficient vector corresponding to the basis $V_j$. Let $V$ be a
matrix constructed by concatenating the $V_j$'s column wise, such that $V = [V_0 | V_1 | \cdots | V_{L-1}]$, and let $c$ be an $N \times 1$ vector with
$c = [c_0^T | \cdots | c_{L-1}^T]^T$. As defined in [33], the vector $c \in \mathbb{R}^N$ is called block $k_0$-sparse over $\mathcal{I} = \{d_0, d_1, \cdots, d_{L-1}\}$
if all the elements in $c_j$ are zeros for all but $k_0$ indices, where $N = \sum_{j=0}^{L-1} d_j$. Then $y$ can be written in the form of

y = AVc + w = Bc + w (18) 

where $B = AV$ is an $M \times N$ matrix and $c$ is a block $k_0$-sparse vector which has $L$ blocks in which all but $k_0$
are zeros. When $d_j = d$ for all $j$, $N = Ld$. With this specific structure, the subspace recovery problem reduces
to finding the indices of blocks in $c$ such that the elements inside that block are nonzero, i.e., the problem of
finding the block sparsity pattern. In addition to the structured union of subspaces model considered here in which 
the block sparsity pattern is observed, there are other instances where block sparsity arises such as in multiband 
signals [40], and in measurements of gene expression levels [35].
Define the support set of the block sparse signal $c$ as

$$\mathcal{U} := \{ i \in \{0, 1, \cdots, L-1\} \mid c_i \neq 0 \}$$

which consists of the indices of the subspaces in the sum in (17) or the indices of the nonzero blocks. With the
above formulation, there are $T = \binom{L}{k_0}$ such support sets and the $j$-th support set is denoted by $\mathcal{U}_j$ for
$j = 0, 1, \cdots, T-1$.

Given that the true block support set is $\mathcal{U}_j$, the measurement vector in (18) can be written as,

$$y = B_j c_j + w$$

where $B_j = A V_j$, $V_j = [V_{u_j^0} | \cdots | V_{u_j^{k_0 - 1}}]$ and $u_j^m$ denotes the $m$-th index in the set $\mathcal{U}_j$ for $m = 0, 1, \cdots, k_0 - 1$.
A similar interpretation holds for the vector $c_j$. To compute the minimum number of samples required for reliable
asymptotic subspace recovery with this structured union of subspaces model based on the ML detector, we can follow
a similar approach as in Theorem 2 with appropriate notation changes. In this case, we can explicitly find $T_0(l)$
required in Theorem 2. More specifically, for given $l$, there are $\binom{k_0}{l}\binom{L-k_0}{l}$ sets such that $|\mathcal{U}_{j\setminus i}| = l$
for any given $\mathcal{U}_j$. Then $T_j(l) = T_0(l) = \binom{k_0}{l}\binom{L-k_0}{l}$. In the next section, we explicitly provide the lower bounds
on the minimum number of samples required for reliable subspace recovery with the ML detector (based on Theorem
2) with this special structure for the union when the sampling operator is represented by random projections.
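The count $T_0(l)$ is easy to tabulate. The sketch below is an illustration only (the function name `t0` and the values $L = 25$, $k_0 = 5$ are chosen here, the latter matching the numerical section). It also checks that summing $T_0(l)$ over $l = 1, \cdots, k_0$ accounts for every competing support set, i.e. $\binom{L}{k_0} - 1$ of them, by the Vandermonde identity.

```python
from math import comb


def t0(L, k0, l):
    """Number of support sets differing from a given one in exactly l blocks."""
    return comb(k0, l) * comb(L - k0, l)


L, k0 = 25, 5
counts = {l: t0(L, k0, l) for l in range(1, k0 + 1)}
total = sum(counts.values())

for l, n in counts.items():
    print(f"l = {l}: T_0(l) = {n}")
print("sum over l:", total, " C(L, k0) - 1:", comb(L, k0) - 1)
```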

IV.B Sampling via random projections 

It has been shown in [33], [35], [39] that block sparse signals can be reliably recovered when the linear operator
is a random projection operator as considered in the traditional CS measurement framework, as long as the sampling
matrix satisfies block-RIP. We assume that the signal of interest $x$ is an $N \times 1$ vector and the sampling operator is
an $M \times N$ matrix with random elements. Further, assume that the $N \times N$ basis matrix $V$ defined in Section IV.A
is orthonormal.

When the sampling operator is an $M \times N$ random matrix $A$, the block sparse observation model in (18) can be
rewritten as,

$$y = AVc + w = Bc + w \qquad (19)$$

where $V$ is an $N \times N$ orthonormal matrix, $c$ is a block sparse signal with $k_0$ nonzero blocks each of length $d$, and the
elements in $A$ are drawn from a random ensemble.
Compared to the analysis in Subsection III.C with general unions when the sampling operator is a random
projection matrix, with the block sparsity model, we can further simplify the expression obtained for $\lambda_{j\setminus i}$ in
Proposition III.1. For a block sparse signal $x$, we define the minimum nonzero block SNR as follows:

Definition IV.2. The minimum nonzero block SNR is defined as $\mathrm{BSNR}_{\min} = \min_{j \in \mathcal{U}} \frac{\|c_j\|_2^2}{\sigma_w^2}$, where $\mathcal{U}$ is the set
containing the indices corresponding to nonzero blocks of the block sparse signal as defined in Section IV.A.

Proposition IV.2. Let $\mathrm{BSNR}_{\min}$ be the minimum nonzero block SNR of a block sparse signal. When the matrix
$A$ consists of elements drawn from a Gaussian ensemble with mean zero and variance 1, for any $\mathcal{U}_j$ and $\mathcal{U}_i$ with
$l = |\mathcal{U}_{j\setminus i}|$ we have,

$$\lambda_{j\setminus i} = \frac{1}{\sigma_w^2}(M - k_0 d) \sum_{m=0}^{l-1} \|V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m}\|_2^2 \geq (M - k_0 d)\, l\, \mathrm{BSNR}_{\min}$$

where $u_{j\setminus i}^m$ denotes the $m$-th index of the set $\mathcal{U}_{j\setminus i}$, which contains the indices of the subspaces in $\mathcal{U}_j$ which are not
in $\mathcal{U}_i$.

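Proof: The proof follows from Proposition III.1 and the following results:

$$\Big\| \sum_{m=0}^{l-1} V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m} \Big\|_2^2 = \Big\langle \sum_{m=0}^{l-1} V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m}, \sum_{m=0}^{l-1} V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m} \Big\rangle$$

$$= \sum_{m=0}^{l-1} \langle V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m}, V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m} \rangle + \sum_{m \neq t} \langle V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m}, V_{u_{j\setminus i}^t} c_{u_{j\setminus i}^t} \rangle$$

$$= \sum_{m=0}^{l-1} \| V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m} \|_2^2 \qquad (20)$$

where the last equality is due to the fact that the columns of $V$ are orthogonal. Then (20) is lower bounded by,

$$\sum_{m=0}^{l-1} \| V_{u_{j\setminus i}^m} c_{u_{j\setminus i}^m} \|_2^2 \geq l\, \sigma_w^2\, \mathrm{BSNR}_{\min}$$

which completes the proof.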
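As a small worked example of Definition IV.2 and the lower bound in Proposition IV.2, the sketch below computes $\mathrm{BSNR}_{\min}$ for a toy block sparse vector and the resulting bound $(M - k_0 d)\, l\, \mathrm{BSNR}_{\min}$. The block values, noise variance and function names are illustrative assumptions, not values from the paper.

```python
def block_snr_min(blocks, noise_var):
    """Minimum of ||c_j||^2 / sigma_w^2 over the nonzero blocks."""
    energies = [sum(c * c for c in b) for b in blocks if any(b)]
    return min(energies) / noise_var


def lambda_lower_bound(M, k0, d, l, bsnr_min):
    """Lower bound (M - k0 d) * l * BSNR_min from Proposition IV.2."""
    return (M - k0 * d) * l * bsnr_min


# Toy signal: L = 5 blocks of length d = 2, k0 = 3 of them nonzero.
blocks = [[3.0, 4.0], [0.0, 0.0], [1.0, 2.0], [0.0, 0.0], [2.0, 2.0]]
noise_var = 0.5

bsnr = block_snr_min(blocks, noise_var)           # block energies 25, 5, 8
bound = lambda_lower_bound(M=20, k0=3, d=2, l=2, bsnr_min=bsnr)
print("BSNR_min =", bsnr, " lower bound on lambda =", bound)
```

The weakest nonzero block determines $\mathrm{BSNR}_{\min}$, so the bound is a worst-case quantity over all support sets that differ in $l$ blocks.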



Corollary IV.1. When the sampling operator is a random projection matrix where the elements are drawn from
a Gaussian ensemble with mean zero and variance 1, the upper bound on the probability of error of the ML
detector in (12) for block sparsity pattern recovery reduces to,

$$P_e \leq \sum_{l=1}^{k_0} \binom{k_0}{l} \binom{L - k_0}{l} \left[ Q\left( \frac{1}{2}(1 - 2c_0)\sqrt{(M - k)\, l\, \mathrm{BSNR}_{\min}} \right) + \Psi(l, \mathrm{BSNR}_{\min}) \right] \qquad (21)$$

where k = k/d, *(/,BSNR min ) = ^^(^(M - k d)lBSNR min ) l / 2 - l / 2 K l/2 _ l/2 (c (M - k d)lBSNR mhl /2) 
and < c < 1/2. 

Next, we investigate the sufficient conditions which state how the number of samples $M$ scales with the other
parameters ($L$, $k_0$, $d$, $\mathrm{BSNR}_{\min}$) to ensure that the probability of error in (21) vanishes asymptotically with the block
sparse model (19).
Lemma IV.2. When $(M - k)\mathrm{BSNR}_{\min} \to \infty$, the probability of error of the ML detector (21) in recovering the
block sparsity pattern vanishes asymptotically if the following conditions are satisfied:

$$M > k + \max\{M_1, M_2\} \qquad (22)$$

where M x = BSNR ^ 2coY (log(L - k ) + log (^)), 

- 4(fc /2 + r - 1) ( 1 ,.,1. /26 e 2 \\ 



vv/f/i < cq < o, tq > a«<f 6q = -^r^- are constants. 



Proof: The proof follows from Theorem 2 and using the relations $\binom{k_0}{l} < \binom{L - k_0}{l}$ for $k_0 < L/2$, and

$$\log \binom{L - k_0}{l} < l \log\left( \frac{e(L - k_0)}{l} \right).$$

■

From Lemma IV.2, we can write the minimum number of random samples required for reliable block sparsity
pattern recovery asymptotically in the form of $O\left(k + \frac{c_1}{\mathrm{BSNR}_{\min}} \log(L - k_0)\right)$ for some constant $c_1$ in the case where
$k_0$ is sufficiently small.

Remark 1. When $\mathrm{BSNR}_{\min} \to \infty$, $M > k$ measurements are sufficient for asymptotic reliable block sparsity
pattern detection with the ML detector.

IV.C Discussion

IV.C.1 Revisiting the standard sparsity model: In the standard sparsity model considered widely in the CS
literature, the subspaces in the union (1) are assumed to be $k$-dimensional subspaces of an orthonormal basis. With
the notation used in Sections IV.A and IV.B, we can represent the standard sparsity model as having $k = k_0 d$ nonzeros
in the sparse signal $c$, where $V$ is the $N \times N$ basis matrix with $N = Ld$ in (19). To have a fair comparison of
the performance of the ML detector in the presence of noise with the standard sparsity model and the block sparsity
model, we introduce further notation. Define the minimum component SNR,

$$\mathrm{CSNR}_{\min} = \min_{m \in \mathcal{U},\ i = 0, \cdots, d-1} \frac{|c_m(i)|^2}{\sigma_w^2}$$

so that $\mathrm{BSNR}_{\min} \geq d\, \mathrm{CSNR}_{\min}$. Then, when the sampling is performed via random projections, the probability
of error of the ML detector with the standard sparsity model is upper bounded by,

$$P_e \leq \sum_{l=1}^{k} \binom{k}{l} \binom{N - k}{l} \left( Q\left( \frac{1}{2}(1 - 2c_0)\sqrt{(M - k)\, l\, \mathrm{CSNR}_{\min}} \right) + \Psi(l, \mathrm{CSNR}_{\min}) \right) \qquad (24)$$

where $N = Ld$, $k = k_0 d$ and $\Psi(l, \mathrm{CSNR}_{\min})$ is as defined in Corollary IV.1. With this notation, the probability
of error of the ML detector with the block sparsity model (21) can be rewritten as

$$P_e \leq \sum_{l=1}^{k_0} \binom{k_0}{l} \binom{L - k_0}{l} \left( Q\left( \frac{1}{2}(1 - 2c_0)\sqrt{(M - k)\, l\, d\, \mathrm{CSNR}_{\min}} \right) + \Psi(l, d\, \mathrm{CSNR}_{\min}) \right). \qquad (25)$$

Based on (24) and (25), it can be shown that the dominant part of the required number of random samples for reliable
subspace detection asymptotically in the presence of noise can be expressed in the form of $O\left(k + \frac{c_1}{d\, \mathrm{CSNR}_{\min}} \log(L - k_0)\right)$
with the block sparsity model and $O\left(k + \frac{c_2}{\mathrm{CSNR}_{\min}} \log(N - k)\right)$ with the standard sparsity model, where $c_1$ and
$c_2$ are positive constants. Thus, the required number of random samples (in terms of $M - k$) for reliable subspace
detection with random projections is reduced by approximately a factor of $d$ with the block sparsity model compared
to that with the standard sparsity model, where $k$ is the total number of nonzero coefficients of the block/standard
sparse signal. Further, it is noted that the above analysis is for the worst case, i.e. the upper bounds on the probability
of error are obtained considering the minimum block/component SNR. The actual number of measurements required
for reliable subspace detection can be less than that predicted in Lemma IV.2.
IV.C.2 Existing results with standard sparsity model: In the most closely related existing work on deriving sufficient
conditions for the ML detector to succeed in the presence of noise with the standard sparsity model (in [23]), the
results are derived based on the following bound on the probability of error:

p ^\-f k \f N - k \ A j (M-k)lCSNR min \ 

When $\mathrm{CSNR}_{\min} \to \infty$, it can be easily seen that this upper bound is bounded away from zero (i.e. it is bounded
by $4e^{-(M-k)/64}\left(\binom{N}{k} - 1\right) > 0$). Based on the upper bound (26), it was shown in [23] that



N fc )).^2= 1 °^ fl ^ \ 



'-linn 



measurements are required for asymptotic reliable sparsity pattern recovery, where $c_1$ is a constant (which is different
from the one used earlier in the paper). With this, when the minimum component SNR $\mathrm{CSNR}_{\min} \to \infty$, the ML
detector requires $k + (c_1 + 2048)\, k \log((N - k)/k)$ measurements for asymptotic reliable recovery, which is much
larger than $k$. However, as shown in [25], [42], when the measurement noise power is negligible (or in the no noise
case), the exhaustive search detector is capable of recovering the sparsity pattern with $M = k + 1$ measurements
with high probability. Thus, there is a gap between the minimum number of measurements predicted by the existing
results in the literature for sparsity pattern recovery and what is actually required.
On the other hand, our results show that when $\mathrm{CSNR}_{\min} \to \infty$, the upper bound on the probability of error in
(24) vanishes with the standard sparsity model when $M > k$. More specifically, when $\mathrm{CSNR}_{\min} \to \infty$, our results
show that $O(k)$ measurements are sufficient for reliable asymptotic sparsity pattern recovery with the ML detector,
which is intuitive. Further, at finite $\mathrm{CSNR}_{\min}$, when $M_2$ dominates $M_1$ in (27), the lower bound on the minimum
number of samples required for asymptotic reliable sparsity pattern recovery obtained in [23] has the same scaling
with respect to $L$, $k$, $d$ and $\mathrm{CSNR}_{\min}$ as that obtained in this paper with the standard sparsity model.
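The size of the limiting value $4e^{-(M-k)/64}\left(\binom{N}{k} - 1\right)$ quoted above can be evaluated directly. The sketch below is illustrative only (the function name and the choice $N = 50$, $k = 10$, taken from the numerical section, are assumptions made here). It computes the limit and the smallest $M$ at which it drops below 1.

```python
from math import ceil, comb, exp, log


def limiting_bound(N, k, M):
    """Value 4 exp(-(M - k)/64) (C(N, k) - 1) approached as CSNR_min grows."""
    return 4.0 * exp(-(M - k) / 64.0) * (comb(N, k) - 1)


N, k = 50, 10
num_wrong_supports = sum(comb(k, l) * comb(N - k, l) for l in range(1, k + 1))

# Smallest M for which the limiting value falls below 1.
m_needed = k + ceil(64.0 * log(4.0 * (comb(N, k) - 1)))

print("competing supports:", num_wrong_supports)
print("limit at M = N:", limiting_bound(N, k, N))
print("M needed for the limit to drop below 1:", m_needed)
```

For these dimensions the limit only becomes informative for $M$ far larger than $N$, which illustrates the gap between that bound and the $M > k$ condition obtained here in the high-SNR regime.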

IV.C.3 Comparison with existing results for block sparsity pattern recovery: The problem of stable recovery
of block sparse signals is discussed in [33], [35], [39]. When the samples are acquired via random projections
(elements in $A$ are Gaussian) with the notation used in Section IV.B, the minimum number of samples required
for the sampling matrix to satisfy block RIP with high probability is given by (from Theorem 3 and (35))

m ^H 2 (£)) + «°<tH ™ 
for some $t > 0$, where $0 < \delta < 1$ is the restricted isometry constant. This is roughly on the order of $\eta_3 k + \eta_4 k_0 \log(L/k_0)$
for some positive constants $\eta_3$ and $\eta_4$. Thus, block sparse signals can be reliably recovered using computationally
tractable algorithms (e.g. extension of BP: mixed $\ell_2/\ell_1$ norm recovery algorithms) with $\eta_3 k + \eta_4 k_0 \log(L/k_0)$
measurements when there is no measurement noise. In the presence of noise, the BP based algorithm developed in [33]
is shown to be robust, and it can tolerate noise in a way that ensures that the norm of the recovery error is bounded by
the noise level. With the analysis presented in Section IV.B for block sparsity pattern recovery with the ML detector in
the presence of measurement noise, roughly $k + (c_1/\mathrm{BSNR}_{\min}) \log(L - k_0)$ measurements are required
(when $k_0$ is fairly small) for reliable block sparsity pattern recovery, where $c_1$ is a positive constant. Here, the
second term is significant at finite $\mathrm{BSNR}_{\min}$ while it vanishes when $\mathrm{BSNR}_{\min} \to \infty$. At finite $\mathrm{BSNR}_{\min}$, when
$k_0$ is sublinear w.r.t. $L$, it can be shown that $k_0 \log(L/k_0) \gg \log(L - k_0)$. Thus, in that region of $k_0$, the relevant
scaling obtained in (28) is larger than what is required by the optimal ML detector derived in this paper at finite
$\mathrm{BSNR}_{\min}$. The exact difference between them depends on the value of $\mathrm{BSNR}_{\min}$ and the relevant constants.
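The gap between the two scalings $k_0 \log(L/k_0)$ and $\log(L - k_0)$ can be seen with a few numbers. This is a rough illustration only: constants are ignored, and the choice $k_0 = \lfloor\sqrt{L}\rfloor$ is just one example of a sublinear regime.

```python
import math


def rip_term(L, k0):
    """Order term k0 * log(L / k0) from the block-RIP scaling."""
    return k0 * math.log(L / k0)


def ml_term(L, k0):
    """Order term log(L - k0) from the ML detection scaling."""
    return math.log(L - k0)


ratios = {}
for L in (10**2, 10**4, 10**6):
    k0 = math.isqrt(L)                      # sublinear in L (assumed regime)
    ratios[L] = rip_term(L, k0) / ml_term(L, k0)
    print(f"L = {L:>7}, k0 = {k0:>4}, ratio = {ratios[L]:.1f}")
```

The ratio grows with $L$ in this regime, consistent with the statement that the block-RIP scaling is the larger of the two when $k_0$ is sublinear in $L$.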

V. Numerical Results 

Several computationally tractable algorithms for sparsity pattern recovery with standard sparsity have been derived 
and discussed quite extensively in the literature. Extensions for such algorithms for model based or structured 
CS have also been considered in several recent works. For example, extensions of CoSamp and iterative hard 
thresholding algorithms for model based CS were considered in ||3T1 . Extensions of OMP algorithm for block 
sparsity pattern recovery (BOMP) were considered in ||35l , ||43l while ||33l , fi4l . P31 considered the Group Lasso 
algorithm for block sparse signal recovery. 

Our goal in this section is to validate the tightness of the derived upper bounds on the probability of error of the
ML detector for subspace recovery with the union of subspaces models and to provide numerical results that illustrate
the performance gap when employing practical algorithms for subspace recovery. It is noted that simulating the
ML algorithm is difficult in high dimensions due to its high computational complexity. For the structured union
of subspaces model considered in Section IV.A, the problem reduces to detecting the block sparsity pattern of a
block sparse signal. The performance of the ML algorithm is compared to block-OMP as proposed in [35], which
is provided in Algorithm 1, where the set $\mathcal{U}$ contains the estimated indices of the nonzero blocks of the block sparse
signal.

Results in both Figures 1 and 2 are based on the special structure considered in (16) for subspaces leading
to block sparsity, and the sampling operator is assumed to be a random matrix whose elements are drawn from
a Gaussian ensemble with mean zero and variance 1. Further, we let the $N \times N$ matrix $V$ be the standard canonical
basis. In Fig. 1, the exact probability of error of the ML detector (obtained via simulation) and the upper bound on
the probability of error derived in (21) vs $M/N$ are shown. In the block sparsity model, we let $N = 50$, $d = 2$,
$L = 25$, $\mathrm{BSNR}_{\min} = 13$ dB, and three different plots correspond to $k_0 = 3, 4, 5$. The exact probability of error of
the ML detector is obtained via Monte Carlo simulations with $10^5$ runs. In the upper bound (21), we let $c_0 = 1/4$.
Algorithm 1 Block-OMP (B-OMP) for sparsity pattern detection

1) Initialize $t = 1$, $\mathcal{U}(0) = \emptyset$, residual vector $r_0 = y$

2) Find the index $\lambda(t)$ such that $\lambda(t) = \arg\max_{i = 1, \cdots, L} \|B_i^T r_{t-1}\|_2$

3) Set $\mathcal{U}(t) = \mathcal{U}(t-1) \cup \{\lambda(t)\}$

4) Compute the projection operator $P(t) = B(\mathcal{U}(t)) \left( B(\mathcal{U}(t))^T B(\mathcal{U}(t)) \right)^{-1} B(\mathcal{U}(t))^T$. Update the residual
vector: $r_t = (I - P(t))y$ (note: $B(\mathcal{U}(t))$ denotes the submatrix of $B$ in which columns are taken from $B$
corresponding to the indices in $\mathcal{U}(t)$)

5) Increment $t = t + 1$ and go to step 2 if $t \leq k_0$, otherwise, stop
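The steps of Algorithm 1 can be sketched in plain Python as follows. This is a minimal illustration, not the implementation used for the reported simulations. It assumes equal block length $d$ and stores $B$ as a list of columns. The projection in step 4 is realised through a Gram-Schmidt orthonormal basis rather than an explicit matrix inverse, and already selected blocks are skipped in step 2 (a design choice made here). The function name `block_omp` and the toy data are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def block_omp(B_cols, y, d, k0):
    """Estimate the block support from y = B c + w.

    B_cols: list of N columns (each of length M); block i is columns
    i*d .. i*d + d - 1. Returns the sorted list of selected block indices.
    """
    L = len(B_cols) // d
    support, basis = [], []          # basis: orthonormal span of chosen columns
    residual = list(y)               # step 1: r_0 = y
    for _ in range(k0):
        # Step 2: block with the largest correlation energy ||B_i^T r||_2.
        best, best_energy = None, -1.0
        for i in range(L):
            if i in support:
                continue
            energy = sum(dot(B_cols[i * d + c], residual) ** 2 for c in range(d))
            if energy > best_energy:
                best, best_energy = i, energy
        support.append(best)         # step 3
        # Step 4: extend the orthonormal basis, then r_t = (I - P(t)) y.
        for c in range(d):
            v = list(B_cols[best * d + c])
            for q in basis:
                coef = dot(v, q)
                v = [a - coef * b for a, b in zip(v, q)]
            norm = math.sqrt(dot(v, v))
            if norm > 1e-12:
                basis.append([a / norm for a in v])
        residual = list(y)
        for q in basis:
            coef = dot(residual, q)
            residual = [a - coef * b for a, b in zip(residual, q)]
    return sorted(support)


# Toy noiseless example: M = N = 6, canonical B, d = 2, blocks 1 and 2 active.
M = 6
B_cols = [[1.0 if r == n else 0.0 for r in range(M)] for n in range(M)]
c = [0.0, 0.0, 1.0, -2.0, 0.5, 0.5]
y = [sum(B_cols[n][r] * c[n] for n in range(M)) for r in range(M)]
estimated = block_omp(B_cols, y, d=2, k0=2)
print("estimated block support:", estimated)
```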



[Figure 1: probability of error versus M/N; N = 50, L = 25, d = 2, BSNR_min = 13 dB. Marked lines: exact probability of error. Plain lines: upper bound on the probability of error.]

Fig. 1. Exact probability of error and the derived upper bound on the probability of error of the ML detector for block sparsity pattern
recovery; N = 50, L = 25, d = 2, and thus k = 10, BSNR_min = 13 dB



It can be seen from Fig. 1 that the derived upper bound on the probability of error is a tight bound on the exact
probability of error, especially as $M/N$ increases.

In Figure 2, the performance of block sparsity pattern recovery with the ML and B-OMP algorithms is shown
as $\mathrm{BSNR}_{\min}$ varies. In Figure 2, we let $k_0 = 5$, $L = 25$, $d = 2$ and $N = 50$. For B-OMP, $10^4$ runs are
performed for a given projection matrix and averaged over 100 runs. In Fig. 2, the ratio between the minimum and
maximum block SNR in both cases considered is set at 1.825. As observed in Fig. 1, it can be seen from Fig. 2 that
the derived upper bound on the probability of error of the ML detector is fairly close to the exact probability of error
obtained via Monte Carlo simulations, especially as $\mathrm{BSNR}_{\min}$ increases. Further, for a given finite $\mathrm{BSNR}_{\min}$,
there seems to be a considerable performance gap between B-OMP and the ML detector. This gap is the price
paid for using the computationally efficient B-OMP algorithm in place of the computationally complex ML detector.
[Figure 2: Block sparsity pattern detection, N = 50, k_0 = 5, d = 2, L = 25. Recoverable legend entries: Exact, BSNR_min = 13 dB; Upper bound, BSNR_min = 13 dB; Upper bound, BSNR_min = 11 dB.]

Fig. 2. Performance of the ML detector and the B-OMP algorithm for block sparsity pattern recovery; L = 25, k_0 = 5, d = 2, and thus
k = 10, N = 50



VI. Conclusion 

In this paper, we investigated the problem of subspace detection based on reduced dimensional samples when 
the signal of interest is assumed to lie in a general union of subspaces model. With a given sampling operator, we 
derived the performance of the optimal ML detector for subspace detection in the presence of noise in terms of the 
probability of error. We further obtained conditions that should be satisfied by the number of samples to guarantee 
asymptotic reliable subspace detection. 

We extended the analysis to a special case of union of subspaces model which reduces to block sparsity. When 
the samples are obtained via random projections, sufficient conditions required for reliable block sparsity pattern 
recovery with the ML detector were derived. Performance gain in terms of the minimum number of samples required 
for asymptotic subspace detection with the block sparse model was quantified compared to that with the standard 
sparsity model. Our results further strengthen the existing results for reliable sparsity pattern recovery with the 
standard sparsity model used in CS framework with random projections in the presence of noise. More specifically, 
our results for sufficient conditions for reliable asymptotic subspace detection are derived based on a tighter bound 
on the probability of error of the ML detector compared to the existing results in the literature with the standard 
sparsity model. We further discussed and illustrated numerically the performance gap between the ML detector and 
the computationally tractable algorithms (e.g. B-OMP) used for subspace detection with the structured union of 
subspaces model. 

As future work, we will investigate the problem of subspace detection with more structured models (in addition
to the one considered in this paper) for the subspaces in the union. Further, we will extend the analysis from the
single node system to a multiple node system in distributed networks.




Appendix A 

Proof of Lemma III.1

To prove Lemma III.1, we follow an argument similar to that considered in [23], with certain differences as
noted in the following. As shown in [23], we may write,

$$\Delta(y) = \|P_i^\perp y\|_2^2 - \|P_i^\perp w\|_2^2 + \|P_i^\perp w\|_2^2 - \|P_j^\perp y\|_2^2.$$

For any given $\delta > 0$, define the events

$$h_1(\delta) = \left\{ \frac{\left| \|P_j^\perp y\|_2^2 - \|P_i^\perp w\|_2^2 \right|}{\sigma_w^2} > \delta \right\} \qquad (29)$$

and

$$h_2(\delta) = \left\{ \frac{\|P_i^\perp y\|_2^2 - \|P_i^\perp w\|_2^2}{\sigma_w^2} < 2\delta \right\}. \qquad (30)$$

Then the event $\Delta(y) < 0$ implies that at least one of the events in (29) and (30) is true. Based on the union bound, we can
write

$$\Pr(\Delta(y) < 0) \leq \Pr(h_1(\delta)) + \Pr(h_2(\delta)).$$

With the standard sparsity model and assuming that the sampling is performed via random projections, upper
bounds on the probabilities $\Pr(h_1(\delta))$ and $\Pr(h_2(\delta))$ are derived in [23]. In contrast, in the following, we derive
the exact value of $\Pr(h_2(\delta))$ and a tighter bound for $\Pr(h_1(\delta))$, assuming that the sampling operator $A$ is known.
Thus, even for the standard sparsity model, the results presented in this paper tighten the results derived in [23].

We first evaluate $\Pr(h_1(\delta))$. Let $\Delta_1(y) = \frac{1}{\sigma_w^2}\left(\|P_j^\perp y\|_2^2 - \|P_i^\perp w\|_2^2\right)$. Assuming the true subspace is $\mathcal{S}_j$, $\Delta_1(y)$
reduces to $\Delta_1(y) = \frac{1}{\sigma_w^2}\left(\|P_j^\perp w\|_2^2 - \|P_i^\perp w\|_2^2\right)$. As shown in [23], the random variable $\Delta_1(y)$ can be represented
as $\Delta_1(y) = x_1 - x_2$ where $x_1$ and $x_2$ are independent and $x_1, x_2 \sim \chi_l^2$, where $l$ is the cardinality of the set $\mathcal{W}_{j\setminus i}$
as defined before. With this notation, we can write

$$\Pr(h_1(\delta)) = \Pr(|x_1 - x_2| > \delta)$$

$$= \Pr((x_1 - x_2) > \delta) + \Pr((x_1 - x_2) < -\delta).$$

The pdf of the random variable $w = x_1 - x_2$ is symmetric around zero and thus we have,

$$\Pr(h_1(\delta)) = 2\Pr((x_1 - x_2) > \delta).$$

Proposition VI.3. When $x_1 \sim \chi_l^2$ and $x_2 \sim \chi_l^2$, the random variable $w = x_1 - x_2$ has the following pdf:

$$f_w(w) = \frac{|w|^{l/2 - 1/2}}{\sqrt{\pi}\, 2^l\, \Gamma(l/2)} K_{l/2 - 1/2}\left(\frac{|w|}{2}\right) \qquad (31)$$

where $K_\nu(x)$ is the modified Bessel function.
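Proof: Since $x_1$ and $x_2$ are independent, the pdf of $w = x_1 - x_2$ is given by [46]

$$f_w(w) = \begin{cases} \int_0^\infty f_{x_1}(w + x_2) f_{x_2}(x_2)\, dx_2; & \text{if } w \geq 0 \\ \int_{-w}^\infty f_{x_1}(w + x_2) f_{x_2}(x_2)\, dx_2; & \text{if } w < 0. \end{cases}$$

First consider the case where $w > 0$. Then

$$f_w(w) = \int_0^\infty \frac{(w + x_2)^{l/2 - 1} e^{-(w + x_2)/2}}{2^{l/2}\Gamma(l/2)} \cdot \frac{x_2^{l/2 - 1} e^{-x_2/2}}{2^{l/2}\Gamma(l/2)}\, dx_2$$

$$= \frac{e^{-w/2}}{2^l (\Gamma(l/2))^2} \int_0^\infty x_2^{l/2 - 1}(w + x_2)^{l/2 - 1} e^{-x_2}\, dx_2$$

$$= \frac{e^{-w/2}}{2^l (\Gamma(l/2))^2} \cdot \frac{1}{\sqrt{\pi}}\, w^{l/2 - 1/2} e^{w/2}\, \Gamma(l/2)\, K_{1/2 - l/2}(w/2)$$

$$= \frac{w^{l/2 - 1/2} K_{l/2 - 1/2}(w/2)}{\sqrt{\pi}\, 2^l\, \Gamma(l/2)}$$

where $K_\nu(x)$ is the modified Bessel function and the third equality is obtained using the integral result
$\int_0^\infty x^{\nu - 1}(x + \beta)^{\nu - 1} e^{-\mu x}\, dx = \frac{1}{\sqrt{\pi}} \left(\frac{\beta}{\mu}\right)^{\nu - 1/2} e^{\beta\mu/2}\, \Gamma(\nu)\, K_{1/2 - \nu}\left(\frac{\beta\mu}{2}\right)$ for $\mu, \nu > 0$ in [47, p. 348].

When $w < 0$, we have,

$$f_w(w) = \frac{e^{-w/2}}{2^l (\Gamma(l/2))^2} \int_{-w}^\infty x_2^{l/2 - 1}(w + x_2)^{l/2 - 1} e^{-x_2}\, dx_2. \qquad (32)$$

Letting $z = -w$ where $z > 0$, (32) can be rewritten as,

$$f_w(w) = \frac{e^{z/2}}{2^l (\Gamma(l/2))^2} \int_z^\infty x_2^{l/2 - 1}(x_2 - z)^{l/2 - 1} e^{-x_2}\, dx_2. \qquad (33)$$

Using the integral result $\int_u^\infty x^{\nu - 1}(x - u)^{\nu - 1} e^{-\mu x}\, dx = \frac{1}{\sqrt{\pi}} \left(\frac{u}{\mu}\right)^{\nu - 1/2} e^{-\mu u/2}\, \Gamma(\nu)\, K_{\nu - 1/2}\left(\frac{\mu u}{2}\right)$ in [47, p. 347] and
the relation $K_\nu(x) = K_{-\nu}(x)$, we get $f_w(w)$ as in (31), completing the proof. ■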
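The density of the difference of two independent $\chi_l^2$ variables can be sanity-checked numerically for even $l$, where the Bessel order $l/2 - 1/2$ is a half-integer and $K_\nu$ has a closed form. The sketch below is an illustration under that restriction. It uses the standard recurrence $K_{\nu+1}(z) = K_{\nu-1}(z) + (2\nu/z)K_\nu(z)$ starting from $K_{1/2}$ and $K_{3/2}$, and a simple midpoint rule. The function names are hypothetical. It checks that the density integrates to one and that for $l = 2$ it reduces to the Laplace density $e^{-|w|/2}/4$.

```python
import math


def bessel_k_half_integer(n, z):
    """K_{n + 1/2}(z) for integer n >= 0 and z > 0, via upward recurrence."""
    k_lo = math.sqrt(math.pi / (2.0 * z)) * math.exp(-z)    # K_{1/2}
    if n == 0:
        return k_lo
    k_hi = k_lo * (1.0 + 1.0 / z)                           # K_{3/2}
    nu = 1.5
    for _ in range(n - 1):
        k_lo, k_hi = k_hi, k_lo + (2.0 * nu / z) * k_hi
        nu += 1.0
    return k_hi


def pdf_chi2_difference(w, l):
    """Density of x1 - x2 with x1, x2 ~ chi^2_l independent (even l, w != 0)."""
    if l < 2 or l % 2:
        raise ValueError("this sketch only handles even l >= 2")
    a = abs(w)
    n = l // 2 - 1                       # order l/2 - 1/2 equals n + 1/2
    return (a ** (l / 2 - 0.5) * bessel_k_half_integer(n, a / 2.0)
            / (math.sqrt(math.pi) * 2 ** l * math.gamma(l / 2)))


def total_mass(l, upper=120.0, steps=120000):
    """Midpoint-rule integral of the density over (-upper, upper)."""
    h = upper / steps
    return 2.0 * h * sum(pdf_chi2_difference((i + 0.5) * h, l)
                         for i in range(steps))


for l in (2, 4, 6):
    print(f"l = {l}: integral of pdf = {total_mass(l):.6f}")
```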

Proposition VI.4. For <5 > 0, the probability Pr(w > 5) is given by, 

Pr(w>5) < 2 , +1 ^ /2) ^ /2 - 1/2 ^/2-i/2(V2) 
where K u (x) is the modified Bessel function, and T(.) is the Gamma function. 
Proof: Based on (f3TT >. we have 



f°° j. f°° w l/2 - 1/2 K 1/2 _ U2 (w/2) 

Using the equivalent integral representation of K v (az) = 4j- / °° e 2 v ' ) t~ v ~ l dt B71 p. 917], we can write the 
integral in (1341 as, 

1 /'OO /'OO x / w 2\ 



Since J^ 00 e ^ dw = \/2~kQ ( -4= J , (1351 ) reduces to, 
< — f°° t l/2 - 3/2 e~ t/4 -^dt (36)

S l/2 - 1/2 K l/2 _ 1/2 (S/2) (37) 



2 l + l Y{l/2 



where we used the inequality Q(x) < \e~~ for x > 0, and the relation, J °° x u ~ 1 e~ l3 ^ x ~ lx dx = 2(^1 K u (2y/]3j) 
for /3 > and 7 > B71 p. 368] while obtaining (1361 ) and (T37T ). respectively, which completes the proof. ■ 

Then, we have 

Pr(hi(S)) = yf^S 112 - 112 ^-!^/!)- (38) 

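Next we compute the quantity $\Pr(h_2(\delta))$. Let $\Delta_2(y) = \frac{1}{\sigma_w^2}\left(\|P_i^\perp y\|_2^2 - \|P_i^\perp w\|_2^2\right)$. Then we have,

$$\Delta_2(y) = \frac{1}{\sigma_w^2}\left(\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2 + 2 w^T P_i^\perp B_{j\setminus i} c_{j\setminus i}\right).$$

Since $w \sim \mathcal{N}(0, \sigma_w^2 I_M)$, $\Delta_2(y)$ is a Gaussian random variable with pdf,

$$\Delta_2(y) \sim \mathcal{N}\left(\frac{1}{\sigma_w^2}\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2,\ \frac{4}{\sigma_w^2}\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2\right).$$

Thus,

$$\Pr(h_2(\delta)) = \Pr(\Delta_2(y) < 2\delta)$$

$$= 1 - Q\left(\frac{2\delta - \frac{1}{\sigma_w^2}\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2^2}{\frac{2}{\sigma_w}\|P_i^\perp B_{j\setminus i} c_{j\setminus i}\|_2}\right)$$

$$= 1 - Q\left(\frac{2\delta - \lambda_{j\setminus i}}{2\sqrt{\lambda_{j\setminus i}}}\right).$$

Since it is desired to control $\delta$ such that $\Pr(h_2(\delta)) < 1/2$, we select $\delta^* = c_0 \lambda_{j\setminus i}$ where $c_0 < \frac{1}{2}$. With this choice
$\Pr(h_2(\delta))$ reduces to,

$$\Pr(h_2(\delta)) = Q\left(\frac{1}{2}\sqrt{\lambda_{j\setminus i}}\,(1 - 2c_0)\right)$$

where we used the relation $1 - Q(-x) = Q(x)$ for $x > 0$, while $\Pr(h_1(\delta))$ reduces to,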

Pr(hx(S)) = ^|^( C0 A Ai ) z / 2 - 1 / 2 AV 2 _i/ 2 (c A jV /2). (39) 
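The closed form for $\Pr(h_2(\delta))$ at $\delta = c_0 \lambda_{j\setminus i}$ can be cross-checked against the Gaussian model $\Delta_2(y) \sim \mathcal{N}(\lambda_{j\setminus i}, 4\lambda_{j\setminus i})$ with a few lines of code. This is a minimal sketch with illustrative values of $\lambda_{j\setminus i}$ and $c_0$. The function names are hypothetical, and the Q function is expressed through `math.erfc`.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x) = Pr(N(0, 1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def pr_h2_closed_form(lam, c0):
    """Q( (1/2) sqrt(lambda) (1 - 2 c0) ), the expression after choosing delta."""
    return q_function(0.5 * math.sqrt(lam) * (1.0 - 2.0 * c0))


def pr_h2_direct(lam, c0):
    """Pr(Delta_2 < 2 delta) with Delta_2 ~ N(lam, 4 lam) and delta = c0 * lam."""
    z = (2.0 * c0 * lam - lam) / (2.0 * math.sqrt(lam))
    return 0.5 * math.erfc(-z / math.sqrt(2.0))       # standard normal CDF at z


for lam, c0 in [(4.0, 0.25), (25.0, 0.1), (100.0, 0.4)]:
    print(lam, c0, pr_h2_closed_form(lam, c0), pr_h2_direct(lam, c0))
```

Both expressions agree, and any $c_0 < 1/2$ keeps the probability below $1/2$, as required in the derivation.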

Appendix B 

Proof of Theorem 2

To obtain conditions under which the probability of error bound in (14) asymptotically vanishes, we rely on the
following corollary.

Corollary VI.2. Let $T_0(l)$ and $\alpha_{\min,l}^2$ be as defined in Subsection III.B. The probability of error of the ML detector
in (14) is further upper bounded by

Pe < X>(/) (^(^onM-kw^, + A (40) 

1=1 \ ' 
where



fl -MM - fe)^ in ,V /2_1 e -W"-*K-,. (41) 



w/iot (M - /v)a 2 =_ , >> (Z/2 - 1/2) /or all I = 1,2,- ■■ , k an d < c < 1/2. 



4T(//2) V4 
w n in 

2 

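Proof: Using the Chernoff bound for the Q function, where $Q(x) \leq \frac{1}{2}e^{-x^2/2}$, we can upper bound the term
$Q\left(\frac{1}{2}(1 - 2c_0)\sqrt{(M - k)\alpha_{\min,l}^2}\right)$ as,

$$Q\left(\frac{1}{2}(1 - 2c_0)\sqrt{(M - k)\alpha_{\min,l}^2}\right) \leq \frac{1}{2}\exp\left(-\frac{1}{8}(1 - 2c_0)^2 (M - k)\alpha_{\min,l}^2\right)$$

for $c_0 < \frac{1}{2}$.

To obtain (41) we used the relation $K_\nu(z) \approx \sqrt{\frac{\pi}{2z}}\, e^{-z}$ when $\nu \ll z$, completing the proof. ■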

It is further noted that when $k$ is fairly small and $\alpha_{\min,l}^2$ is sufficiently large, the condition required for (41) is
often satisfied. We consider the conditions under which each term in (40) goes to 0 asymptotically, or equivalently,
the logarithm of each term $\to -\infty$. First consider the first term in the summation in (40), for which the logarithm gives,

$$\log T_0(l) + \log(1/2) - \frac{1}{8}(1 - 2c_0)^2 (M - k)\alpha_{\min,l}^2$$

$$\leq \max_{l} \left\{ \log(T_0(l)) + \log(1/2) - \frac{1}{8}(1 - 2c_0)^2 (M - k)\alpha_{\min,l}^2 \right\} \to -\infty$$

as $(M - k) \to \infty$ when $M > k + M_1$ where $M_1 = \max_{l} \left\{ \frac{8}{(1 - 2c_0)^2 \alpha_{\min,l}^2} \left( \log(T_0(l)) + \log(1/2) \right) \right\}$. Considering
the second term in (40), let



n x = JogToCO + log (^^^yj + (V2 " 1) log (jco(M - fc)o4in,iJ - -c (Af - fc)^, (42) 

where &o = "4"- When \cq{M — £O a min/ is sufficiently large, we can find < qo < nn^j] sucn th at 
log (\cq(M - fc) a min,n < Qo^c (M - fc)a 2 ninr Then (02]) is upper bounded by 



Ui < i max fc |log(T (0) + log (f^y) " Q C °( M ~ k )<i*,l) ^ " ?o(fc/2 ~ l ))) = n 2 (43) 

where < qo < i k /2-\) - We can write go in the form of qo = 2 (k/2+r -1) ^ or some r o > 0. Thus, (l43l ) can be 
rewritten as 



n 2 = maxllog{T (l))+hg(^)-(-c {M-k)a 2 miTl ,) , >/0 : > -> -00 



,=w p iow T 1U * VT^J " V2 Co(M " fc)a ^ n 'V ro + I/2-1 

as (M-Jfe) -»• 00 when Af > fc+M 2 where M 2 = max J y+^-i) (i og (r (/)) + i og (ih) U, < c < 1/2, 
b = *f , and r > 0. 